
All Questions

1 vote · 0 answers · 102 views

Confused about the use of random states for training models in scikit-learn

I am new to ML and currently working on improving the accuracy of an MLPClassifier in scikit-learn. My code looks like so ...
asked by Leandro
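
A minimal sketch of the usual answer, assuming synthetic data (make_classification and all parameter values below are illustrative): pinning random_state makes a run repeatable, and looping over several seeds shows how much of the reported accuracy is just chance.

    # Hypothetical example: fixed seeds give reproducible runs; varying the
    # seed reveals the spread of accuracies due to random splits/initialization.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=500, random_state=0)
    for seed in range(5):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
        clf = MLPClassifier(max_iter=500, random_state=seed).fit(X_tr, y_tr)
        print(seed, clf.score(X_te, y_te))
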
3 votes · 1 answer · 636 views

Why does scikit's cross-validation return a negative R^2 for my strongly correlated data?

I have exactly the following preprocessed data in a small Pandas dataframe: ...
asked by Arepo
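
The short version of why this can happen: cross_val_score computes R^2 on each held-out fold, and a model that predicts worse than that fold's own mean scores below zero, even when features and target are strongly correlated overall. A hedged sketch on synthetic noisy data (all values illustrative):

    # Small, noisy held-out folds can yield negative out-of-fold R^2.
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold, cross_val_score

    X, y = make_regression(n_samples=30, n_features=1, noise=50.0, random_state=0)
    scores = cross_val_score(LinearRegression(), X, y,
                             scoring="r2", cv=KFold(n_splits=5))
    print(scores)  # per-fold R^2; individual values can dip below 0
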
0 votes · 1 answer · 37 views

Sklearn EstimatorCV vs GridSearchCV

sklearn has the following description for EstimatorCV estimators (https://scikit-learn.org/stable/glossary.html#term-cross-validation-estimator): "An estimator that has built-in cross-validation ...
asked by wannabedatascientist
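
To make the distinction concrete, here is a hedged side-by-side sketch (dataset and parameter grid are illustrative): an EstimatorCV such as LogisticRegressionCV tunes its own regularization parameter internally, while GridSearchCV wraps any estimator and any grid.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(random_state=0)
    # Built-in CV: tunes C internally, often faster via shared computation
    built_in = LogisticRegressionCV(Cs=10, cv=5, max_iter=1000).fit(X, y)
    # Generic search: same idea, works for any estimator and parameter grid
    grid = GridSearchCV(LogisticRegression(max_iter=1000),
                        {"C": [0.01, 0.1, 1, 10]}, cv=5).fit(X, y)
    print(built_in.C_, grid.best_params_)
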
2 votes · 1 answer · 31 views

Scoring function in cross-validation often left at its default

I'm a PhD student applying ML in microbiology. In research papers, the usual performance measure reported for classification models is ROC-AUC. But when I look at implementations, the scoring function ...
asked by alepfu
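
The usual gap between papers and code is just the scoring argument: scikit-learn defaults to accuracy for classifiers, and reporting ROC-AUC means passing scoring="roc_auc" explicitly. A minimal sketch (data and model illustrative):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(random_state=0)
    clf = LogisticRegression(max_iter=1000)
    print(cross_val_score(clf, X, y, cv=5).mean())                     # accuracy (default)
    print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())  # ROC-AUC
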
1 vote · 1 answer · 91 views

How do I identify overfitting when using GridSearchCV?

For context, I'm using scikit-learn's GridSearchCV to find the best hyperparameters for a decision tree. I believe I understand train, validation, and test sets and overfitting concepts when applied ...
asked by Lisana Daniel
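
A common way to spot this, sketched here with illustrative data and grid values: ask GridSearchCV for training scores and compare them with the validation scores per candidate; a large gap signals overfitting.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=400, random_state=0)
    grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                        {"max_depth": [2, 5, 10, None]},
                        cv=5, return_train_score=True).fit(X, y)
    for depth, tr, va in zip(grid.cv_results_["param_max_depth"],
                             grid.cv_results_["mean_train_score"],
                             grid.cv_results_["mean_test_score"]):
        # a big train/validation gap for a candidate indicates overfitting
        print(depth, round(tr, 3), round(va, 3))
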
0 votes · 0 answers · 23 views

How to use cross-validation to select/evaluate a model with a probability score as the output?

Initially I was evaluating my models using cross_val with out-of-the-box metrics such as precision, recall, F1 score, etc., or with my own metrics defined in ...
asked by szheng
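
One option, in sketch form, using only built-in scorer strings that grade the predicted probabilities rather than hard labels (model and data are illustrative):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_validate

    X, y = make_classification(random_state=0)
    res = cross_validate(LogisticRegression(max_iter=1000), X, y, cv=5,
                         # log loss and Brier score consume predict_proba directly
                         scoring=["neg_log_loss", "neg_brier_score", "roc_auc"])
    print(res["test_neg_log_loss"].mean(), res["test_neg_brier_score"].mean())
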
1 vote · 1 answer · 175 views

Integration of feature selection in a Pipeline

I have noticed that integrating feature selection in a pipeline alters results. Pipeline 1 gives slightly different results from pipeline 2. Why should this be so? Pipeline 2 ...
asked by wwnde
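
The likely cause, in a hedged sketch (selector and classifier choices are illustrative): inside a pipeline the selector is refit on each training fold, so the chosen features can differ fold to fold, whereas selecting once on the full data leaks information and gives different (usually more optimistic) results.

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import Pipeline

    X, y = make_classification(n_features=30, n_informative=5, random_state=0)
    pipe = Pipeline([("select", SelectKBest(f_classif, k=5)),  # refit per training fold
                     ("clf", LogisticRegression(max_iter=1000))])
    print(cross_val_score(pipe, X, y, cv=5).mean())
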
0 votes · 1 answer · 90 views

Error when using KFold() and the roc_auc metric

Why does cross_val_score(pipe,X,y,scoring="roc_auc",cv=StratifiedKFold()) work just fine, while using KFold() like ...
asked by jxqbbb
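
The usual explanation, demonstrated on deliberately sorted labels (everything below is illustrative): plain KFold preserves row order, so a test fold can end up containing a single class, where ROC-AUC is undefined; shuffling or stratifying avoids it.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

    X, y = make_classification(random_state=0)
    order = np.argsort(y)        # sort rows so the classes are contiguous
    X, y = X[order], y[order]    # unshuffled KFold would now yield one-class test folds
    clf = LogisticRegression(max_iter=1000)
    print(cross_val_score(clf, X, y, scoring="roc_auc",
                          cv=KFold(n_splits=5, shuffle=True, random_state=0)).mean())
    print(cross_val_score(clf, X, y, scoring="roc_auc",
                          cv=StratifiedKFold(n_splits=5)).mean())
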
0 votes · 1 answer · 629 views

Tuned model has higher CV accuracy, but a lower test accuracy. Should I use the tuned or untuned model?

I am working on a classification problem using scikit-learn and am confused about how to properly tune hyperparameters to get the "best" model. Before any tuning, my logistic regression ...
asked by d0dg3r_k1d
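
The standard advice, sketched with an illustrative dataset and grid: choose between models using CV scores only, and touch the test set once, as a final estimate rather than a selection criterion.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split

    X, y = make_classification(n_samples=600, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    grid = GridSearchCV(LogisticRegression(max_iter=1000),
                        {"C": [0.01, 0.1, 1, 10]}, cv=5).fit(X_tr, y_tr)
    # Selection happens on CV scores; the test score is reported, not used to choose.
    print(grid.best_params_, grid.best_score_, grid.score(X_te, y_te))
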
0 votes · 1 answer · 102 views

How do I test whether overfitting exists when I use the cross_val_score method?

I got the following code from a book on XGBoost. I wonder whether this is a correct way of analyzing the cross-validation score for overfitting purposes. The mean accuracy is 81, which can be okay. But what if ...
asked by Mehmet Deniz
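
cross_val_score alone reports only the validation side, so it cannot show overfitting by itself; cross_validate with return_train_score=True gives both sides to compare. A hedged sketch (model and data illustrative):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_validate
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=400, random_state=0)
    res = cross_validate(DecisionTreeClassifier(random_state=0), X, y,
                         cv=5, return_train_score=True)
    # A mean train score far above the mean test score suggests overfitting.
    print(res["train_score"].mean(), res["test_score"].mean())
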
1 vote · 1 answer · 853 views

Does sklearn perform feature selection within cross validation?

I would like to add a feature selector to my pipeline and use GridSearchCV to tune both the hyperparameters of the selector and the classifier(s). I am wondering if sklearn performs feature selection ...
asked by Antonios Sarikas
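
One common arrangement, sketched with illustrative choices: put the selector inside the pipeline, and GridSearchCV will refit it on each training fold while tuning its parameters alongside the classifier's.

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC

    X, y = make_classification(n_features=30, n_informative=5, random_state=0)
    pipe = Pipeline([("select", SelectKBest(f_classif)), ("clf", SVC())])
    grid = GridSearchCV(pipe, {"select__k": [5, 10, 20],   # selector parameter
                               "clf__C": [0.1, 1, 10]},    # classifier parameter
                        cv=5).fit(X, y)
    print(grid.best_params_)  # selection happened inside each fold, never on test folds
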
0 votes · 1 answer · 1k views

Is there any benefit to using cross validation from the XGBoost library over sklearn when tuning hyperparameters?

The XGBoost library has its own implementation of cross validation through xgboost.cv(). It looks like it requires data to be stored as a DMatrix. Instead of using ...
asked by Eli
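
A hedged comparison sketch, assuming the xgboost package is installed (all parameter values are illustrative): the native xgboost.cv works on a DMatrix and supports early stopping per boosting round, while the sklearn wrapper composes with cross_val_score on plain arrays.

    import xgboost as xgb
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(random_state=0)
    dtrain = xgb.DMatrix(X, label=y)              # native API requires a DMatrix
    hist = xgb.cv({"objective": "binary:logistic"}, dtrain,
                  num_boost_round=200, nfold=5, early_stopping_rounds=10)
    print(len(hist))                              # boosting rounds kept after early stopping
    # Wrapper API: plain arrays, composes with sklearn tooling
    print(cross_val_score(xgb.XGBClassifier(n_estimators=50), X, y, cv=5).mean())
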
0 votes · 1 answer · 87 views

Confusion regarding K-fold Cross Validation

In k-fold cross-validation, we divide the dataset into k folds, train the model on k-1 folds, and test the model on the remaining fold. We do so until every fold has been assigned as the test ...
asked by AAA
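
A tiny sketch that makes the mechanics visible (indices only, no model needed): across the k splits, every sample appears in the test fold exactly once.

    import numpy as np
    from sklearn.model_selection import KFold

    X = np.arange(10).reshape(-1, 1)
    for fold, (train_idx, test_idx) in enumerate(KFold(n_splits=5).split(X)):
        # each of the 10 samples lands in exactly one test fold
        print(fold, "train:", train_idx, "test:", test_idx)
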
0 votes · 1 answer · 241 views

Can I use scikit-learn's cross_val_predict with cross_validate?

I am looking to make a visualization of my cross-validation data in which I can see the predictions that occurred within the cross-validation process. I am using scikit-learn's cross_validate to ...
asked by lambdaChops
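
They can be combined; the key is passing the same cv splitter to both calls so the out-of-fold predictions line up with the scores. A hedged sketch on synthetic regression data (all values illustrative):

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold, cross_val_predict, cross_validate

    X, y = make_regression(n_samples=200, noise=10.0, random_state=0)
    cv = KFold(n_splits=5, shuffle=True, random_state=0)  # identical splits for both calls
    scores = cross_validate(LinearRegression(), X, y, cv=cv)
    preds = cross_val_predict(LinearRegression(), X, y, cv=cv)  # out-of-fold predictions
    print(scores["test_score"].mean(), preds[:5])
    # preds can be scattered against y for a predicted-vs-actual plot
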
4 votes · 0 answers · 92 views

Does a ROC AUC difference between cross-validation and the test set indicate overfitting or another problem?

I am training a composite model (XGBoost, Linear Regression, and RandomForest) to predict the probability of people being injured. Below are the results of cross-validation with 5 folds. I can't see any problem ...
asked by GregOliveira
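
One way to frame the diagnosis, sketched with an illustrative model and data: compare the CV AUC on the training portion with the AUC on a held-out test set; a test AUC well below the CV AUC can point to overfitting the CV procedure or to a train/test distribution shift, not necessarily a bug.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import cross_val_score, train_test_split

    X, y = make_classification(n_samples=800, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    clf = RandomForestClassifier(random_state=0)
    cv_auc = cross_val_score(clf, X_tr, y_tr, cv=5, scoring="roc_auc").mean()
    test_auc = roc_auc_score(y_te, clf.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
    print(cv_auc, test_auc)  # compare the two estimates rather than eyeball one
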
